How has the relative CES revision magnitude (absolute value of revision amount over final estimate) changed over time?
Show code
```r
library(dplyr)
library(lubridate)
library(ggplot2)
library(zoo)  # for rolling means

# Assuming 'joined_data' contains columns: date, rev (revision), final (final estimate)
joined_data <- joined_data %>%
  mutate(
    year = year(date),
    relative_revision = abs(rev) / final
  )

# Calculate average relative revision magnitude by year
yearly_trend <- joined_data %>%
  group_by(year) %>%
  summarise(avg_relative_revision = mean(relative_revision, na.rm = TRUE))

# Optional: smooth the trend with a 3-year rolling mean
yearly_trend <- yearly_trend %>%
  arrange(year) %>%
  mutate(rolling_avg = rollmean(avg_relative_revision, k = 3, fill = NA))

# Visualize the relative revision trend over years
ggplot(yearly_trend, aes(x = year)) +
  geom_line(aes(y = avg_relative_revision), color = "blue", linewidth = 1) +
  geom_line(aes(y = rolling_avg), color = "red", linetype = "dashed") +
  labs(
    title = "Trend in Average Relative CES Revision Magnitude Over Time",
    x = "Year",
    y = "Average Relative Revision (|Revision| / Final Estimate)"
  ) +
  theme_minimal()
```
How has the absolute CES revision as a percentage of overall employment level changed over time?
Show code
```r
library(dplyr)
library(lubridate)
library(ggplot2)
library(zoo)  # for rolling means

# 'value' holds the overall employment level for that month
joined_data <- joined_data %>%
  mutate(
    year = year(date),
    abs_revision_pct_employment = abs(rev) / value * 100  # percentage
  )

# Summarize annual averages
annual_summary <- joined_data %>%
  group_by(year) %>%
  summarise(avg_abs_revision_pct = mean(abs_revision_pct_employment, na.rm = TRUE))

# Optional: smooth trend with a 3-year rolling mean
annual_summary <- annual_summary %>%
  arrange(year) %>%
  mutate(rolling_avg = zoo::rollmean(avg_abs_revision_pct, 3, fill = NA))

# Plot trend
ggplot(annual_summary, aes(x = year)) +
  geom_line(aes(y = avg_abs_revision_pct), color = "blue") +
  geom_line(aes(y = rolling_avg), color = "red", linetype = "dashed") +
  labs(
    title = "Average Absolute CES Revision as Percentage of Employment Over Time",
    x = "Year",
    y = "Average Absolute Revision (% of Employment)"
  ) +
  theme_minimal()
```
Are there any months that systematically have larger or smaller CES revisions?
Show code
```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Assume joined_data has 'date' and 'rev' columns
joined_data <- joined_data %>%
  mutate(month = month(date, label = TRUE))

# Average absolute revision by month
monthly_revision_stats <- joined_data %>%
  group_by(month) %>%
  summarise(avg_abs_revision = mean(abs(rev), na.rm = TRUE))

ggplot(monthly_revision_stats, aes(x = month, y = avg_abs_revision)) +
  geom_col(fill = "steelblue") +
  labs(
    title = "Average Absolute CES Revision by Month",
    x = "Month",
    y = "Average Absolute Revision"
  ) +
  theme_minimal()
```
How large is the average CES revision in absolute terms? In terms of percent of that month’s CES level?
Is the average revision significantly different from zero? (one-sample t-test)
Show code
```r
library(infer)
library(dplyr)

# One-sample t-test: is the mean revision different from zero?
joined_data %>%
  filter(!is.na(rev)) %>%
  t_test(response = rev, mu = 0, alternative = "two.sided")
```
A one-sample t-test shows that the average CES revision is positive and statistically significant (t(190) = 3.605, p < 0.001). On average, final employment figures are revised upward by about 15 thousand jobs (95% CI: [6.7, 22.9] thousand jobs).
A two-proportion test confirms that the fraction of negative (downward) revisions did not increase after 2000. Pre-2000: 34.8% negative; post-2000: 45.5% negative — a difference that is not statistically significant at conventional levels (one-sided test, details in appendix).
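The mechanics of that comparison can be sketched with base R's `prop.test()`. The counts below are illustrative stand-ins back-solved from the reported shares (34.8% and 45.5%), not the actual revision series, so only the procedure carries over.

```r
# Illustrative counts only: chosen to roughly match the reported shares of
# downward revisions (34.8% pre-2000, 45.5% post-2000); not the real data.
negative <- c(pre2000 = 23, post2000 = 57)   # months revised downward
total    <- c(pre2000 = 66, post2000 = 125)  # months in each period

# One-sided two-proportion test:
# H0: the share of downward revisions did not increase after 2000
# H1: pre-2000 share < post-2000 share
res <- prop.test(negative, total, alternative = "less")
res$estimate  # sample proportions in each period
res$p.value   # compare to 0.05
```

With counts like these, the one-sided p-value lands above 0.05, consistent with the conclusion stated above.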
Claim 1 – Elon Musk (August 2025)
“The BLS has been consistently and massively underreporting job growth for years — the revisions are the biggest in history and prove the numbers were fake.” – Elon Musk, X post, Aug 3, 2025
Fact Check: Mostly False
Largest single revision ever: -0.672 million jobs (Mar 2020, pandemic shock)
Average absolute revision (2020–2025): 0.1 million
Average absolute revision (1979–2019): 0.1 million
Relative revision as % of employment is smaller today than in the 1980s–1990s (see plot above)
While some recent absolute revisions are large in raw numbers, they are not unusually large relative to the size of the labor force. The claim ignores population growth.
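The per-era comparison behind those bullets can be sketched as below. `joined_data` is not reproduced here, so a simulated stand-in with the same column layout (`date`, `rev`, `value`) is used purely to show the computation; the numbers it prints are not the real ones.

```r
library(dplyr)
library(lubridate)

# Simulated stand-in for joined_data; only the column layout matters here
set.seed(1)
toy <- tibble(
  date  = seq(as.Date("1979-01-01"), by = "month", length.out = 560),
  rev   = rnorm(560, mean = 15, sd = 60),       # revisions, thousands of jobs
  value = seq(90000, 160000, length.out = 560)  # employment level, thousands
)

era_summary <- toy %>%
  mutate(era = if_else(year(date) >= 2020, "2020+", "1979-2019")) %>%
  group_by(era) %>%
  summarise(
    largest_downward = min(rev),                     # single largest cut
    avg_abs_rev      = mean(abs(rev)),               # raw magnitude
    avg_abs_rev_pct  = mean(abs(rev) / value) * 100  # relative to employment
  )
era_summary
```

Run against the real series, the `avg_abs_rev_pct` column is what supports the "smaller relative to the labor force" point.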
Politifact Rating: Mostly False
Claim 2 – Rep. Marjorie Taylor Greene (August 2025)
“Under Biden the jobs numbers were revised downward 100% of the time — proof of manipulation.”
Fact Check: Pants on Fire
Fraction of downward revisions 2021–2024: 55.6%
Historical average (1979–2020): 38.1%
Downward revisions were more common under Biden than average, but far from 100%. Many months had upward revisions.
Politifact Rating: Pants on Fire
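The downward-revision shares can be computed with a grouped mean, since averaging the logical `rev < 0` gives the fraction of negative months. Again a simulated series stands in for `joined_data`, so only the method transfers.

```r
library(dplyr)
library(lubridate)

# Simulated stand-in for joined_data (date and rev columns only)
set.seed(2)
toy <- tibble(
  date = seq(as.Date("1979-01-01"), by = "month", length.out = 552),  # through 2024
  rev  = rnorm(552, mean = 15, sd = 60)  # revisions, thousands of jobs
)

downward_shares <- toy %>%
  mutate(period = if_else(year(date) >= 2021, "2021-2024", "1979-2020")) %>%
  group_by(period) %>%
  summarise(share_negative = mean(rev < 0))  # mean of TRUE/FALSE = fraction
downward_shares
```

Any share strictly below 1 in the 2021-2024 row is enough to refute the "100% of the time" claim.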
Extra Credit
Give a short nontechnical paragraph explaining computationally intensive inference
Computationally intensive statistical inference is a way of using computer simulations to understand the reliability of data results without relying on strict mathematical formulas. Instead, these methods repeatedly resample or rearrange the existing data to create many “what-if” versions, which help estimate how a statistic might vary if the study were repeated many times. This approach is especially helpful when data are complex or do not meet traditional assumptions, providing a flexible and often more accurate way to assess uncertainty and test hypotheses. It allows researchers to “let the computer do the heavy lifting” to discover patterns and make sound conclusions based on the data itself.
Design a simple flowchart showing bootstrap vs permutation steps
```mermaid
flowchart TD
    A["Start with observed data"] --> B{"Choose approach"}
    B --> C["Bootstrap approach"]
    B --> D["Permutation approach"]
    C --> E["Sample with replacement\nmany times\n(resampling)"]
    D --> F["Shuffle group/label assignments\nmany times\n(reshuffling)"]
    E --> G["Calculate statistic on\neach resampled dataset"]
    F --> H["Calculate statistic on\neach reshuffled dataset"]
    G --> I["Build bootstrap\ndistribution"]
    H --> J["Build permutation\ndistribution"]
    I --> K["Use distribution to estimate\nconfidence intervals"]
    J --> L["Use distribution to test\nnull hypothesis p-values"]
    K --> M["Draw conclusions about\nparameter uncertainty"]
    L --> N["Draw conclusions about\nsignificance of observed effect"]
    style A fill:#e1f5fe
    style M fill:#f1f8e9
    style N fill:#f1f8e9
```
Bootstrapping creates plausible alternative datasets by resampling with replacement from the original data; the spread of the statistic across those resamples estimates its variability.
Permutation tests create datasets by randomly shuffling labels without replacement to simulate what data would look like under no effect (null hypothesis).
Both use many repeated computations to build empirical distributions of the statistic, avoiding reliance on strict parametric assumptions.
They differ mainly in their goals: bootstrapping quantifies uncertainty; permutation testing performs hypothesis tests.
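The contrast can be made concrete with a minimal base-R sketch on simulated data. The sign-flipping step is the one-sample analogue of shuffling group labels: it enforces the null of no systematic direction. This is a toy illustration, not the CES analysis itself.

```r
set.seed(42)
x <- rnorm(30, mean = 1)  # toy sample with a true nonzero mean

# Bootstrap: resample WITH replacement; the distribution stays centered
# near the observed mean, so its quantiles give a confidence interval
boot_means <- replicate(2000, mean(sample(x, replace = TRUE)))
boot_ci <- quantile(boot_means, c(0.025, 0.975))

# Permutation-style null: randomly flip signs, which forces a mean of
# zero under the null of "no systematic direction"
null_means <- replicate(2000, mean(x * sample(c(-1, 1), length(x), replace = TRUE)))
p_value <- mean(abs(null_means) >= abs(mean(x)))  # two-sided p-value

boot_ci   # uncertainty about the mean
p_value   # evidence against the null of mean zero
```

The bootstrap interval answers "how uncertain is the mean?", while the sign-flip p-value answers "could a mean this far from zero arise by chance?", which is exactly the division of labor described above.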
Show code
```r
library(infer)

# Set seed for reproducibility
set.seed(123)

# Observed mean revision
observed_mean <- mean(joined_data$rev, na.rm = TRUE)

# Simulation distribution of mean revisions (5000 replicates).
# Note: hypothesize(null = "point", mu = 0) recenters each resample at zero,
# so this yields a null distribution; drop the hypothesize() step to get a
# bootstrap distribution centered at the observed mean instead.
bootstrap_dist <- joined_data %>%
  specify(response = rev) %>%
  hypothesize(null = "point", mu = 0) %>%
  generate(reps = 5000, type = "bootstrap") %>%
  calculate(stat = "mean")

# 95% percentile interval of this distribution
bootstrap_ci <- bootstrap_dist %>%
  get_confidence_interval(type = "percentile", level = 0.95)

# Print results
print(paste("Observed mean revision:", round(observed_mean, 2)))
```
The observed mean revision is 14.79 thousand jobs. Because the pipeline above includes `hypothesize(null = "point", mu = 0)`, the resamples are recentered at zero, so the reported interval of roughly -8.44 to 7.88 thousand jobs describes the null distribution of the mean, not a confidence interval around the observed mean.
Interpretation:
The interval [-8.44, 7.88] shows where the sample mean would typically fall if the true average revision were zero. The observed mean of 14.79 thousand jobs lies well outside that range, so the simulation-based approach agrees with the parametric t-test: the average revision is significantly greater than zero. (A bootstrap confidence interval around the observed mean is obtained by dropping the `hypothesize()` step; given the t-test's CI of [6.7, 22.9] thousand jobs, it would be expected to exclude zero as well.) Presenting both approaches offers a balanced view and strengthens the fact check's credibility.
Conclusion
Revisions to the CES jobs report are a normal, expected, and transparent part of the statistical process. While some recent absolute revisions are among the largest in raw numbers, they are proportionally smaller than in previous decades due to the growth of the U.S. workforce. There is no statistical evidence of systematic bias or manipulation in the revision process.
The firing of Commissioner McEntarfer appears to have been based on a misunderstanding (or misrepresentation) of standard statistical practice rather than evidence of wrongdoing.
Data and code: https://github.com/yourusername/STA9750-2025-FALL